AITopics

Country:

North America > United States > Florida > Pinellas County > St. Petersburg (0.04)
North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
North America > United States > Arizona > Maricopa County > Phoenix (0.04)
(5 more...)

Genre:

Research Report > New Finding (0.46)
Instructional Material > Course Syllabus & Notes (0.46)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Neural Information Processing SystemsNov-20-2025, 22:19:14 GMT

Learning Loop Invariants for Program Verification

The problem is undecidable and even practical instances are challenging. Inspired by how human experts construct loop invariants, we propose a reasoning framework Code2Inv that constructs the solution by multi-step decision making and querying an external program graph memory block. By training with reinforcement learning, Code2Inv captures rich program features and avoids the need for ground truth solutions as supervision. Compared to previous learning tasks in domains with graph-structured data, it addresses unique challenges, such as a binary objective function and an extremely sparse reward that is given by an automated theorem prover only after the complete loop invariant is proposed. We evaluate Code2Inv on a suite of 133 benchmark problems and compare it to three state-of-the-art systems. It solves 106 problems compared to 73 by a stochastic search-based system, 77 by a heuristic search-based system, and 100 by a decision tree learning-based system. Moreover, the strategy learned can be generalized to new programs: compared to solving new instances from scratch, the pre-trained agent is more sample efficient in finding solutions.

learning loop invariant, name change, program verification, (3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.97)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.83)

arXiv.org Artificial IntelligenceNov-18-2025

VeriStruct: AI-assisted Automated Verification of Data-Structure Modules in Verus

Sun, Chuyue, Sun, Yican, Amrollahi, Daneshvar, Zhang, Ethan, Lahiri, Shuvendu, Lu, Shan, Dill, David, Barrett, Clark

We introduce VeriStruct, a novel framework that extends AI-assisted automated verification from single functions to more complex data structure modules in Verus. VeriStruct employs a planner module to orchestrate the systematic generation of abstractions, type invariants, specifications, and proof code. To address the challenge that LLMs often misunderstand Verus' annotation syntax and verification-specific semantics, VeriStruct embeds syntax guidance within prompts and includes a repair stage to automatically correct annotation errors. In an evaluation on eleven Rust data structure modules, VeriStruct succeeds on ten of the eleven, successfully verifying 128 out of 129 functions (99.2%) in total. These results represent an important step toward the goal of automatic AI-assisted formal verification.

large language model, machine learning, programming language, (21 more...)

2510.25015

Country: North America > United States (0.68)

Genre: Research Report (0.64)

Industry: Information Technology (0.67)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

Neural Information Processing SystemsOct-10-2025, 20:11:49 GMT

Towards General Loop Invariant Generation: A Benchmark of Programs with Memory Manipulation

assertion, invariant, loop invariant, (13 more...)

Country:

North America > United States > Florida > Pinellas County > St. Petersburg (0.04)
North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
North America > United States > Arizona > Maricopa County > Phoenix (0.04)
(5 more...)

Genre:

Research Report > New Finding (0.46)
Instructional Material > Course Syllabus & Notes (0.46)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Panigrahy, Rina, Sharan, Vatsal

Limitations on Safe, Trusted, Artificial General Intelligence

arXiv.org Artificial IntelligenceSep-29-2025

Safety, trust and Artificial General Intelligence (AGI) are aspirational goals in artificial intelligence (AI) systems, and there are several informal interpretations of these notions. In this paper, we propose strict, mathematical definitions of safety, trust, and AGI, and demonstrate a fundamental incompatibility between them. We define safety of a system as the property that it never makes any false claims, trust as the assumption that the system is safe, and AGI as the property of an AI system always matching or exceeding human capability. Our core finding is that -- for our formal definitions of these notions -- a safe and trusted AI system cannot be an AGI system: for such a safe, trusted system there are task instances which are easily and provably solvable by a human but not by the system. We note that we consider strict mathematical definitions of safety and trust, and it is possible for real-world deployments to instead rely on alternate, practical interpretations of these notions. We show our results for program verification, planning, and graph reachability. Our proofs draw parallels to Gödel's incompleteness theorems and Turing's proof of the undecidability of the halting problem, and can be regarded as interpretations of Gödel's and Turing's results.

ai system, machine learning, natural language, (15 more...)

2509.21654

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.34)

Industry:

Leisure & Entertainment > Games (0.46)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
(2 more...)

arXiv.org Artificial IntelligenceSep-29-2025

InvBench: Can LLMs Accelerate Program Verification with Invariant Synthesis?

Wei, Anjiang, Suresh, Tarun, Sun, Tianran, Wu, Haoze, Wang, Ke, Aiken, Alex

Program verification relies on loop invariants, yet automatically discovering strong invariants remains a long-standing challenge. We introduce a principled framework for evaluating LLMs on invariant synthesis. Our approach uses a verifier-based decision procedure with a formal soundness guarantee and assesses not only correctness but also the speedup that invariants provide in verification. We evaluate 7 state-of-the-art LLMs, and existing LLM-based verifiers against the traditional solver UAutomizer. While LLM-based verifiers represent a promising direction, they do not yet offer a significant advantage over UAutomizer. Model capability also proves critical, as shown by sharp differences in speedups across models, and our benchmark remains an open challenge for current LLMs. Finally, we show that supervised fine-tuning and Best-of-N sampling can improve performance: fine-tuning on 3589 instances raises the percentage of speedup cases for Qwen3-Coder-480B from 8% to 29.2%, and Best-of-N sampling with N=16 improves Claude-sonnet-4 from 8.8% to 22.1%.

invariant, large language model, machine learning, (19 more...)

2509.21629

Country:

Europe (0.93)
North America > United States (0.68)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Communications of the ACMMay-29-2025, 15:05:51 GMT

Technical Perspective: A Symbolic Approach to Verifying Quantum Systems

Exceptional added value may lie in connecting two complementary areas of computer science. This statement is particularly true when applying mature techniques developed in one area to solve complex problems that arise in a new area. The accompanying paper, "An Automata-Based Framework for Verification and Bug Hunting in Quantum Circuits" by Lengál et al., is a case in point. It applies techniques developed in logic, automata, and symbolic verification to analyze the correctness of quantum programs. The current quest of quantum computing is achieving quantum supremacy--that is, to reach the point where we solve problems that are practically unsolvable using conventional computing.

artificial intelligence, quantum circuit, verification, (14 more...)

Communications of the ACM

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.34)

arXiv.org Artificial IntelligenceFeb-7-2025

RAG-Verus: Repository-Level Program Verification with LLMs using Retrieval Augmented Generation

Zhong, Sicheng, Zhu, Jiading, Tian, Yifang, Si, Xujie

Scaling automated formal verification to real-world projects requires resolving cross-module dependencies and global contexts, which are challenges overlooked by existing function-centric methods. We introduce RagVerus, a framework that synergizes retrieval-augmented generation with context-aware prompting to automate proof synthesis for multi-module repositories, achieving a 27% relative improvement on our novel RepoVBench benchmark -- the first repository-level dataset for Verus with 383 proof completion tasks. RagVerus triples proof pass rates on existing benchmarks under constrained language model budgets, demonstrating a scalable and sample-efficient verification.

large language model, machine learning, natural language, (19 more...)

2502.05344

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceDec-13-2024

Enhancing Automated Loop Invariant Generation for Complex Programs with Large Language Models

Liu, Ruibang, Li, Guoqiang, Chen, Minyu, Wu, Ling-I, Ke, Jingyu

Automated program verification has always been an important component of building trustworthy software. While the analysis of real-world programs remains a theoretical challenge, the automation of loop invariant analysis has effectively resolved the problem. However, real-world programs that often mix complex data structures and control flows pose challenges to traditional loop invariant generation tools. To enhance the applicability of invariant generation techniques, we proposed ACInv, an Automated Complex program loop Invariant generation tool, which combines static analysis with Large Language Models (LLMs) to generate the proper loop invariants. We utilize static analysis to extract the necessary information for each loop and embed it into prompts for the LLM to generate invariants for each loop. Subsequently, we employ an LLM-based evaluator to assess the generated invariants, refining them by either strengthening, weakening, or rejecting them based on their correctness, ultimately obtaining enhanced invariants. We conducted experiments on ACInv, which showed that ACInv outperformed previous tools on data sets with data structures, and maintained similar performance to the state-of-the-art tool AutoSpec on numerical programs without data structures. For the total data set, ACInv can solve 21% more examples than AutoSpec and can generate reference data structure templates.

invariant, large language model, machine learning, (20 more...)

2412.10483

Country:

Europe > Austria > Vienna (0.14)
Europe > Switzerland (0.05)
North America > United States > New York > New York County > New York City (0.04)
(7 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural Information Processing SystemsOct-7-2024, 11:36:20 GMT

Reviews: Learning Loop Invariants for Program Verification

The paper presents a novel deep network architecture termed DELPHI to automatically infer loop invariants for use in program verification. The architecture takes in as input source code which has (1) a number of assumption or assignment statements, (2) a loop with nested if-else statements with arithmetic operations and (3) a final assertion statement. The output of the architecture is a loop invariant in CNF which holds true at every iteration in the loop, and for which the assertion (3) is true after the loop ends execution. The architecture represents the source code AST using a graph-structured neural network, and treats it as a structured memory which it repeatedly accesses through attention operations. The generation of the CNF invariant is broken up into a sequential decision-making process where at each time the architecture predicts an output (op, T), where op is either && or and T is a simple logical expression.

architecture, delphi, invariant, (11 more...)

Genre: Research Report > New Finding (0.31)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.37)